Abstract

We propose a hybrid image-space/object-space solution to 
the classical hidden surface removal problem: Given n
disjoint triangles in $\mathbb{R}^3$ and p sample points ("pixels") in
the xy-plane, determine the first triangle directly behind
each pixel. Our algorithm constructs the sampled visibility 
map of the triangles with respect to the pixels, which is the 
subset of the trapezoids in a trapezoidal decomposition of 
the analytic visibility map that contain at least one pixel. 
The sampled visibility map adapts to local changes in image 
complexity, and its complexity is bounded both by the 
number of pixels and by the complexity of the analytic 
visibility map. Our algorithm runs in time $O(n^{1+\varepsilon} +
n^{2/3+\varepsilon} t^{2/3} + p)$, where t is the output size. This is nearly
optimal in the worst case and compares favorably with the 
best output-sensitive algorithms for both ray casting and 
analytic hidden surface removal. In the special case where 
the pixels form a regular grid, a sweepline variant of our 
algorithm runs in time $O(n^{1+\varepsilon} + n^{2/3+\varepsilon} t^{2/3} + t \log p)$, which
is usually sublinear in the number of pixels.
1 Introduction

Hidden surface removal is one of the oldest and most 
important problems in computer graphics. Informally, 
the problem is to compute the portions of a given 
collection of geometric objects, typically composed of 
triangles, that are visible from a given camera position 
and orientation in $\mathbb{R}^3$. In order to simplify calculation
(and explanation), a projective transformation is applied
so that the camera is at $-\infty$ on the z-axis and all
vertices have positive z-coordinates, so that the desired
image is the orthographic projection of the objects onto
the xy-plane. We will follow the computer graphics
convention that the y-axis is vertical, the x- and z-axes
are horizontal, and the positive z-axis points into the 
image, directly away from the camera. 

Historically, there are two different approaches to 
solving the hidden surface removal problem: object 
space and image space [46]. Object-space (or analytic)
hidden surface removal algorithms compute which ob- 
ject is visible at every point in the image plane. Image- 
space algorithms, on the other hand, compute only the 
object visible at a finite number of sample points. We 
will refer to the sample points themselves as "pixels", 
since usually there is one sample point per pixel in 
the final finite-resolution output image. (Image-space 
algorithms that compute sub-pixel features do so by 
sampling a small constant number of points within each 
pixel area [20].) 

The output of an object-space hidden surface removal 
algorithm is the projection of the forward envelope^1
of the objects onto the image plane. The resulting 
planar decomposition is called the visibility map of the 
objects. Each face of the visibility map is a maximal 
connected region in which a particular triangle, or no 
triangle, is visible. McKenna [38] described the first 
algorithm to compute visibility maps in $O(n^2)$ time,
where n is the number of input triangles; see also [15]. 
This is optimal in the worst case. Unfortunately,
McKenna's algorithm always uses $\Theta(n^2)$ time and
space, even when the visibility map is much simpler. 
This shortcoming led to the development of several 
output-sensitive algorithms, whose running time de- 
pends not only on n, the number of triangles, but also 
on v, the number of vertices of the visibility map. The
fastest algorithm currently known, an improvement by 
Agarwal and Matousek [2] of an algorithm of de Berg 
et al. [6], runs in time $O(n^{1+\varepsilon} + n^{2/3+\varepsilon} v^{2/3})$. For more
details on these and other object-space algorithms, see 
the comprehensive survey by Dorward [17]. 

The primary disadvantage of the object-space ap- 
proach is the potentially high complexity of the visibil- 
ity map, which may be much larger than the number of 
pixels in the desired output image, even for reasonable 
input sizes. Even when the visibility map is not overly 
complex, it may contain features that are significantly 
smaller than the area of a pixel and thus do not con- 
tribute to the final image. This is especially problematic 
for applications of hidden-surface removal such as form- 
factor calculation, where the desired output image may 
have very low resolution [45]. 

^1 This would be called the "lower envelope" if the z-axis were
vertical.
For image-space algorithms, on the other hand, the
ultimate goal is to compute, for each pixel in the finite- 
resolution output image, which triangle is visible at 
that pixel. The most common image-space approach 
is the z-buffer algorithm introduced by Catmull [9]. 
This algorithm loops through the triangles, determining 
the pixels that each triangle covers in the image plane; 
each pixel maintains the smallest z-coordinate of any 
triangle covering that pixel. While this algorithm can be 
implemented cheaply in hardware, it can still be quite 
slow when the number of triangles and number of pixels 
are both large. 

Another common image-space approach is ray cast- 
ing (also known as ray tracing and ray shooting): 
Shoot a ray from each pixel in the positive z-direction 
and compute the first triangle it hits. Using the
best known unidirectional ray-shooting data structure, 
due to Agarwal and Sharir [3], we obtain an algorithm 
with running time $O((n + n^{2/3} p^{2/3} + p) \log n)$, where n
is the number of triangles and p is the number of pixels. 
Erickson's lower bound for Hopcroft's problem [18] 
suggests that this algorithm is close to optimal in the 
worst case, even for the simpler problem of deciding 
whether any ray hits a triangle. In practice, ray- 
shooting queries are answered by walking through a 
decomposition of space determined by the triangles, 
such as an octree [21], triangulation [4], or binary space
partition [40, 42]. See [4, 29] for related theoretical 
results. 

Neither z-buffers nor ray casting exploit spatial co- 
herence in the image. If the visible triangles are 
fairly large, then the same triangle is likely to be 
visible through several pixels; however, both algorithms 
compute the triangle behind each pixel independently. 
Spatial coherence is exploited to some extent by more 
complex techniques such as Warnock's subdivision al- 
gorithm [49], hierarchical z-buffers [23], hierarchical 
coverage masks [24], and frustum casting [48], which 
construct a recursive quadtree-like decomposition of 
the image. However, this decomposition can be much 
more complex than the visibility map if, for example, 
the image contains several long diagonal lines. In 
particular, if the pixels lie in a regular $\sqrt{p} \times \sqrt{p}$ grid,
the decomposition can have complexity $\Omega(v\sqrt{p})$.

A few hidden surface removal algorithms work si- 
multaneously in both image and object space [28, 50]. 
The basic idea for these algorithms is to traverse the 
objects in order from front to back (i.e., by increasing
"distance" from the camera), decomposing the image 
plane using the boundaries of the objects and reverting 
to ray casting when any region of the image plane
contains only a single pixel. Of course, there are sets
of triangles that do not have a consistent depth order, and
these algorithms will produce incorrect output if such
a set is given as input. While a depth order can always
be guaranteed by first decomposing the triangles with 
a binary space partition tree, this could produce $O(n^2)$
triangle fragments in the worst case [42]. One exception
to the depth-order requirement is Weiler and Atherton's 
algorithm [50], which decomposes the image plane into 
regions within which the triangles can be depth-ordered; 
this algorithm can also produce a quadratic number 
of fragments. The image decompositions produced by
these algorithms cannot be analyzed either in
terms of the complexity of the visibility map, since they 
can decompose triangles even when all depth cycles are 
invisible, or in terms of the number of pixels, since they 
can produce many fragments that do not contain a pixel 
at all. 

In this paper, we propose another hybrid approach 
to hidden surface removal that exploits both spatial 
coherence and finite precision. In Section 2, we define 
the sampled visibility map of a set of triangles with re- 
spect to a set of pixels. Like other image-decomposition 
schemes, the sampled visibility map adapts to local 
changes in the image complexity, but unlike previous 
approaches its complexity is easily bounded both by 
the complexity of the analytic visibility map and by 
the number of pixels. 

We describe an output-sensitive algorithm to construct
the sampled visibility map in Section 4. Our algorithm
runs in time $O(n^{1+\varepsilon} + n^{2/3+\varepsilon} t^{2/3} + p)$, where
t is the number of trapezoids in the output. This 
matches the performance of Agarwal and Matousek's 
visibility map algorithm when $t = \Theta(v)$, and almost
matches Agarwal and Sharir's ray-casting algorithm
when $t = \Theta(p)$. Our algorithm does not require the
triangles to have a consistent depth order, nor does it 
decompose the triangles into orderable fragments. A 
variant of our algorithm allows a sequence of pixels to 
be specified online, at an additional amortized cost of 
$O(\log t)$ time per pixel.

The algorithms presented in Section 4 assume that 
the pixels are just arbitrary points in the xy-plane. 
In Section 5, we describe a faster algorithm for the 
common special case where the pixels are the vertices of 
a rectangular grid. The running time of our improved
algorithm is $O(n^{1+\varepsilon} + n^{2/3+\varepsilon} t^{2/3} + t \log p)$, which is
sublinear in the number of pixels unless the output is
very large.

Finally, in Section 6, we discuss some other applica- 
tions of our techniques and suggest directions for further 
research.

2 Definitions 

Let A be a set of n disjoint triangles in $\mathbb{R}^3$, where
every vertex has positive z-coordinate. We say that a
triangle $\triangle \in A$ is visible at a point $\pi$ in the xy-plane
if a ray from $\pi$ in the positive z-direction hits $\triangle$ before
any other triangle in A. The visibility map Vis(A)
is a planar straight-line graph, each face of which is a
maximal connected region in which a particular triangle
in A, or no triangle, is visible. See Figure 1(a). Let v
denote the number of vertices of Vis(A).

The trapezoidal decomposition of Vis (A), denoted 
Trap (Vis (A)), is obtained by decomposing each face 
into (possibly degenerate) trapezoids, two of whose 
edges are vertical (i.e., parallel to the y-axis). The 
vertical edges are defined by casting segments up and/or 
down from each vertex into the face, stopping when the 
segment reaches another edge of the face. Faces are 
decomposed individually, so only one vertical edge is 
added at a "T" vertex where one visible edge appears 
to overlap another. See Figure 1(b). 

Finally, let P be a set of p points in the xy-plane, 
called "pixels". The sampled visibility map of A 
with respect to P, denoted Vis (A | P), is the subset 
of trapezoids in Trap (Vis (A)) that contain at least 
one pixel in P. See Figure 1(d). Let t denote the 
number of trapezoids in Vis(A | P). Clearly $t \le p$,
since every trapezoid in Vis(A | P) contains at least one
pixel. Moreover, since Trap(Vis(A)) contains at most
2v trapezoids, $t \le 2v$.
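
As a minimal sketch of this definition, the following Python snippet filters an already computed trapezoidal decomposition down to the trapezoids that contain at least one pixel. The `Trapezoid` class, the representation of the top and bottom edges as lines y = a*x + b, and the function name are illustrative assumptions, not notation from the paper. Pixels are assumed not to lie on trapezoid boundaries, so that each pixel belongs to exactly one trapezoid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """A vertical trapezoid: an x-range plus top/bottom lines y = a*x + b."""
    x_left: float
    x_right: float
    top: tuple      # (a, b) for the line through the top edge
    bottom: tuple   # (a, b) for the line through the bottom edge

    def contains(self, pixel):
        x, y = pixel
        if not (self.x_left <= x <= self.x_right):
            return False
        y_top = self.top[0] * x + self.top[1]
        y_bottom = self.bottom[0] * x + self.bottom[1]
        return y_bottom <= y <= y_top


def sampled_visibility_map(trapezoids, pixels):
    """Keep only the trapezoids that contain at least one pixel (brute force)."""
    return [tau for tau in trapezoids if any(tau.contains(p) for p in pixels)]


# Hypothetical decomposition with three trapezoids and three pixels.
traps = [
    Trapezoid(0, 4, (0, 4), (0, 0)),      # the square [0,4] x [0,4]
    Trapezoid(4, 8, (0, 4), (0.5, -2)),   # slanted bottom edge
    Trapezoid(8, 9, (0, 1), (0, 0)),      # thin sliver with no pixel inside
]
pixels = [(1, 1), (3, 3), (6, 3)]
sampled = sampled_visibility_map(traps, pixels)
```

In this toy input the sliver is discarded, so the sampled map has two trapezoids, which is at most the number of pixels and at most the size of the full decomposition.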

3 Building One Trapezoid in Vis (A | P) 

A naive algorithm for constructing the sampled visibil- 
ity map would start by constructing Vis (A). While this 
approach leads to an algorithm that is nearly optimal 
in the worst case, it cannot give an output-sensitive 
algorithm. To obtain output-sensitivity, we construct 
Vis(A | P) one trapezoid at a time. Specifically, for
each pixel $\pi \in P$, if it is unmarked, we determine the
trapezoid $\tau_\pi \in$ Trap(Vis(A)) that contains it and then
mark all the pixels contained in $\tau_\pi$. We construct each
trapezoid in four stages, which are illustrated in Figure
2.

Stage 1. Forward Ray Shooting. The first stage
in constructing the trapezoid $\tau_\pi$ is to determine the
triangle visible at $\pi$; see Figure 2(a). This is done by
answering a unidirectional ray-shooting query, exactly
as in the standard ray-casting algorithm. Agarwal and
Sharir [3] describe a data structure that can answer such
queries in time $O((n/\sqrt{s}) \log n)$ using a data structure
of size $O(s \log n)$, where s can be chosen anywhere
between n and $n^2$. The preprocessing time needed to
construct this data structure is $O(s \log n)$.

Agarwal and Sharir's data structure is actually designed
to answer point stabbing queries for a set of
triangles in the plane: How many triangles contain
the query point? Like most geometric range searching
structures, their data structure defines a number of
canonical subsets of the set of triangles. For any point
$\pi$, the set of triangles that contain $\pi$ can be expressed
as the disjoint union of $O((n/\sqrt{s}) \log n)$ canonical
subsets; in particular, this implies that the triangles in
any canonical subset have a common intersection. Their
data structure stores the size of each canonical subset,
and a stabbing query is answered by summing up the
sizes of the relevant canonical subsets. To obtain a
unidirectional ray-shooting data structure for our
three-dimensional triangles A, it suffices to build Agarwal and
Sharir's point-stabbing structure for the xy-projection
of A. Now the triangles in any canonical subset have
a consistent front-to-back ordering, and the triangle
visible through $\pi$ can be computed by comparing the
front-most triangles in the relevant canonical subsets.
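
To make the query itself concrete, here is a brute-force Python sketch of forward ray shooting. It is not the Agarwal and Sharir structure: it tests all n triangles per pixel instead of a sublinear number of canonical subsets. The function names and the representation of a triangle as three (x, y, z) tuples are assumptions made for illustration.

```python
def z_at(tri, x, y):
    """Return the z-coordinate where the line through (x, y) parallel to the
    z-axis meets the triangle, or None if the projection misses (x, y)."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None  # the xy-projection of the triangle is degenerate
    # Barycentric coordinates of (x, y) in the projected triangle.
    a = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    b = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    c = 1 - a - b
    if min(a, b, c) < 0:
        return None
    return a * z1 + b * z2 + c * z3


def visible_triangle(triangles, pixel):
    """Index of the first triangle hit by the ray from pixel in the +z
    direction, or None if the ray hits nothing."""
    best = None
    for i, tri in enumerate(triangles):
        z = z_at(tri, *pixel)
        if z is not None and (best is None or z < best[0]):
            best = (z, i)
    return None if best is None else best[1]


# Two hypothetical triangles parallel to the xy-plane, at depths 5 and 2.
scene = [
    ((0, 0, 5), (4, 0, 5), (0, 4, 5)),
    ((0, 0, 2), (2, 0, 2), (0, 2, 2)),
]
```

Because the triangles are disjoint, the smallest z-coordinate along the ray identifies the visible triangle unambiguously.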

Stage 2. Vertical Ray Dragging. The second stage in
our algorithm finds the top and bottom edges of $\tau_\pi$.
Intuitively, these edges are computed by dragging the
ray through $\pi$ parallel to the y-axis until the triangle
hit by the ray changes. See Figure 2(b). Let $\triangle_\pi \in A$
be the triangle visible at $\pi$, and let $\hat\pi$ be the point
on $\triangle_\pi$ with the same x- and y-coordinates as $\pi$. (To
avoid the case where no triangle is visible at $\pi$, we can
assume that there is a large "background" triangle.) Let
the curtain of a triangle edge be the set of points on
or directly behind that edge; each curtain is a three-sided
unbounded polygonal slab, two of whose sides are
parallel to the z-axis [6]. We can find the top (resp.
bottom) edge of $\tau_\pi$ by shooting a ray from $\hat\pi$ along
the surface of $\triangle_\pi$ in the positive (resp. negative)
y-direction. In each case, the desired edge is determined
either by an edge of $\triangle_\pi$ or by the first curtain hit by the
ray. Agarwal and Matousek [2] describe a data structure
of size $O(s n^{\varepsilon})$, where s can be chosen anywhere between
n and $n^2$, that can answer ray shooting queries in a
set of n curtains in time $O(n^{1+\varepsilon}/\sqrt{s})$, after $O(s n^{\varepsilon})$
preprocessing time.

Stage 3. Oblique Ray Dragging. Each vertical trapezoid
edge in Trap(Vis(A)) is defined either by a vertex of
Vis(A) at its top or bottom endpoint, or by a projected
visible vertex of some triangle, which could lie anywhere
in the edge. The third stage looks for the nearest
vertices of Vis(A) along the top and bottom edges of
$\tau_\pi$. Let $\bar{e}$ and $\underline{e}$ be triangle edges whose projections lie
directly above and below $\pi$, respectively, and let $\bar\pi \in \bar{e}$
and $\underline\pi \in \underline{e}$ be the points with the same x-coordinate
as $\pi$. Intuitively, we drag rays to the left and right
along $\bar{e}$ (resp. $\underline{e}$), starting at $\bar\pi$ (resp. $\underline\pi$), stopping when
each ray either hits another edge or hits an endpoint of
$\bar{e}$ (resp. $\underline{e}$); see Figure 2(c). Just as in the previous
stage, each ray-dragging query can be answered by
performing a ray-shooting query in the set of curtains in
time $O(n^{1+\varepsilon}/\sqrt{s})$, using Agarwal and Matousek's data
structure [2].

Figure 1. (a) The visibility map Vis(A) of a set A of triangles, (b) its trapezoidal decomposition Trap(Vis(A)), (c) with a grid of pixels
P, and (d) the resulting sampled visibility map Vis(A | P).

Figure 2. Building one trapezoid in Vis(A | P). (a) Shoot a ray into the scene through the pixel to the first triangle. (b) Drag rays up and
down to find the top and bottom edges. (c) Drag rays along the top and bottom edges to find their (potential) endpoints. (d) Narrow
the trapezoid by locating the nearest visible vertices on either side.

Stage 4. Swath Sweeping. In the final stage, we
search for the visible triangle vertices whose projections
lie beneath the top edge and above the bottom edge
of $\tau_\pi$, and whose x-coordinates are closest to that of
the pixel $\pi$. Since we know that $\triangle_\pi$ is the only triangle
visible in $\tau_\pi$, it suffices to consider only triangle vertices
in front of the plane containing $\triangle_\pi$, and we can assume
that all such vertices are visible. Intuitively, we take
the vertical swath of rays swept in Stage 2, and sweep
it to the left and right until it hits such a vertex.

We will describe only the leftward sweep; the rightward
sweep is completely symmetric. It suffices to build
a data structure storing only the rightmost vertex of
each triangle, i.e., the vertex with largest x-coordinate.
To answer a swath-sweep query, we perform a binary
search over the x-coordinates of the rightmost vertices,
looking for the left edge of $\tau_\pi$. At each step in
the binary search, we determine whether a particular
query trapezoid $\tau$ contains the projection of any visible
triangle vertex. Intuitively, at each step, we cast a
trapezoidal beam forward into the triangles and ask
whether it encounters any triangle vertex before it hits
$\triangle_\pi$. In fact, since the trapezoid $\tau$ lies entirely inside
the projection of $\triangle_\pi$, it suffices to check whether the
beam hits a vertex before the plane containing $\triangle_\pi$.

We answer this trapezoidal beam query using a 
multi-level data structure. Multi-level data structures 
allow us to decompose complicated queries into simpler 
components and devise independent data structures for 
each component. The size (resp. query time) of a 
multi-level structure is the size (resp. query time) of its 
largest (resp. slowest) component, times an additional 
factor of O(logn) per "level". See [1, 37] for detailed 
descriptions of this standard technique. 

We decompose trapezoidal beam queries by observing
that the beam through a trapezoid $\tau$ contains a visible
vertex v if and only if

(a) the x-coordinate of v is between the left and right
x-coordinates of $\tau$,

(b) the xy-projection of v is below the top edge of $\tau$,

(c) the xy-projection of v is above the bottom edge
of $\tau$, and

(d) v is in front of the plane containing $\triangle_\pi$.
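
The four conditions translate directly into a predicate. The Python sketch below checks them by brute force over all vertices; it illustrates what a trapezoidal beam query asks and is not the multi-level structure described next. The tuple layout of the trapezoid, the plane representation z = a*x + b*y + c, and the function names are assumptions, and "in front of" is taken to mean a smaller z-coordinate than the plane at the same (x, y).

```python
def vertex_in_beam(vertex, tau, plane):
    """Check conditions (a)-(d) for one vertex (x, y, z).

    tau   = (x_left, x_right, top, bottom), where top and bottom are
            (slope, intercept) pairs for the lines through those edges.
    plane = (a, b, c), meaning z = a*x + b*y + c.
    """
    x, y, z = vertex
    x_left, x_right, top, bottom = tau
    a, b, c = plane
    return (x_left <= x <= x_right                 # (a) inside the x-range
            and y <= top[0] * x + top[1]           # (b) below the top edge
            and y >= bottom[0] * x + bottom[1]     # (c) above the bottom edge
            and z < a * x + b * y + c)             # (d) in front of the plane


def beam_query(vertices, tau, plane):
    """True if the beam through tau meets some vertex before the plane."""
    return any(vertex_in_beam(v, tau, plane) for v in vertices)


# Hypothetical query: the rectangle [0,4] x [0,3] and the plane z = 10.
tau = (0, 4, (0, 3), (0, 0))
plane = (0, 0, 10)
misses = [(1, 1, 12), (5, 1, 2), (2, 5, 2)]   # behind, right of, above the beam
```

Each condition corresponds to one level of the data structure, which is why the query decomposes so cleanly.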

The first level of our data structure is a range tree [5]
over the x-coordinates of the triangle vertices, which
lets us (implicitly) find the vertices between the left
and right sides of $\tau$ in $O(\log n)$ time. This level requires
$O(n)$ space and $O(n \log n)$ preprocessing time.

The next two levels let us (implicitly) find all the
vertices whose xy-projections lie in the wedge determined
by the top and bottom edges of $\tau$. One level
finds the points below the top edge; the other finds
the points above the bottom edge. For each level, we
can use a two-dimensional halfplane query structure of
Agarwal and Sharir [3], which answers queries in time
$O((n/\sqrt{s}) \log n)$ using space $O(s)$ and preprocessing
time $O(s \log n)$, for any s between n and $n^2$.

Finally, in the last level, we need to determine
whether any vertex lies in front of the plane containing
$\triangle_\pi$. We can answer this three-dimensional halfspace
emptiness query in $O(\log n)$ time, $O(n)$ space, and
$O(n \log n)$ preprocessing time using (for example) a
Dobkin-Kirkpatrick hierarchy [16].

Combining all four levels, we obtain a data structure
of size $O(s \log^{O(1)} n)$, with preprocessing time $O(s \log^{O(1)} n)$,
that can answer any trapezoidal beam query in
time $O((n/\sqrt{s}) \log^{O(1)} n)$, for any $n \le s \le n^2$. Thus,
the overall time to answer a swath-sweep query is
$O((n/\sqrt{s}) \log^{O(1)} n)$.

Putting all four stages together, we obtain the following 
result. The time and space bounds are dominated by 
the curtain ray-shooting data structure in the second 
and third stages. 

Lemma 3.1. Let A be a set of n disjoint triangles in $\mathbb{R}^3$,
and let s be a parameter between n and $n^2$. We can
build a data structure of size $O(s n^{\varepsilon})$ in time $O(s n^{\varepsilon})$, so
that for any point $\pi$ in the xy-plane, we can construct
the trapezoid $\tau_\pi \in$ Trap(Vis(A)) containing $\pi$ in time
$O(n^{1+\varepsilon}/\sqrt{s})$.
4 All Trapezoids

4.1 Guessing the Output Size 

Lemma 3.1 implies that for any positive integer t, the
total time to build our data structure and construct t
trapezoids is

$$O\left(s n^{\varepsilon} + \frac{t\, n^{1+\varepsilon}}{\sqrt{s}}\right).$$

If we know the number of trapezoids in advance, we
can minimize the total running time by setting $s =
\max(n, t^{2/3} n^{2/3})$; the resulting time bound is $O(n^{1+\varepsilon} +
t^{2/3} n^{2/3+\varepsilon})$.
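
The choice of s can be checked by balancing the two terms of the cost implied by Lemma 3.1: preprocessing time $O(s n^{\varepsilon})$ plus t queries at $O(n^{1+\varepsilon}/\sqrt{s})$ each. The following worked derivation ignores constant factors and treats $n^{\varepsilon}$ as a common factor; the name $T(s)$ is introduced here only for illustration.

```latex
\begin{align*}
T(s) &= s\,n^{\varepsilon} + \frac{t\,n^{1+\varepsilon}}{\sqrt{s}}, \\
s\,n^{\varepsilon} = \frac{t\,n^{1+\varepsilon}}{\sqrt{s}}
  &\iff s^{3/2} = t\,n
  \iff s = t^{2/3} n^{2/3}, \\
T\bigl(t^{2/3} n^{2/3}\bigr) &= 2\,t^{2/3} n^{2/3+\varepsilon}.
\end{align*}
% The constraint s >= n binds exactly when t^{2/3} n^{2/3} <= n, i.e. t <= sqrt(n).
% In that range we take s = n, and
\begin{align*}
T(n) = n^{1+\varepsilon} + t\,n^{1/2+\varepsilon} \le 2\,n^{1+\varepsilon}
  \qquad \text{for } t \le \sqrt{n}.
\end{align*}
```

Taking the larger of the two cases gives the bound $O(n^{1+\varepsilon} + t^{2/3} n^{2/3+\varepsilon})$.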

In our application, however, t is the number of
trapezoids in Vis(A | P), which is not known in advance.
We can obtain the same overall running time in this case
using the following standard doubling trick, previously
used in several output-sensitive analytic hidden surface
removal algorithms [41, 3, 6]. Our algorithm runs
in several phases. In the ith phase, we build the
data structures from scratch with $s = 2^{2i/3} n$, and
then construct the next $2^i \sqrt{n}$ trapezoids. The time
for the ith phase is $O(2^{2i/3} n^{1+\varepsilon})$, and the algorithm
goes through $\lceil \log_2(t/\sqrt{n}) \rceil$ phases before it builds all t
trapezoids. Since the phase times form a geometric
series, the total is dominated by the last phase, which
takes $O(t^{2/3} n^{2/3+\varepsilon})$ time.
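
The phase schedule can be simulated numerically. The Python sketch below lists the phases for given n and t and evaluates a cost model taken from Lemma 3.1 with every hidden constant set to 1. Numbering the phases from i = 1, truncating the last batch at t, the value eps = 0.1, and the function names are all assumptions made for this illustration.

```python
import math


def doubling_phases(n, t):
    """Return a list of (i, s_i, batch_i) until t trapezoids are built.

    Phase i rebuilds the structures with s_i = 2^(2i/3) * n and then
    constructs up to 2^i * sqrt(n) further trapezoids.
    """
    root_n = math.isqrt(n)  # exact when n is a perfect square
    phases, built, i = [], 0, 1
    while built < t:
        s_i = 2 ** (2 * i / 3) * n
        batch = min(2 ** i * root_n, t - built)  # last phase may stop early
        phases.append((i, s_i, batch))
        built += batch
        i += 1
    return phases


def phase_cost(n, s, batch, eps=0.1):
    """Preprocessing s * n^eps plus batch queries at n^(1+eps) / sqrt(s)."""
    return s * n ** eps + batch * n ** (1 + eps) / math.sqrt(s)


n, t, eps = 10_000, 5_000, 0.1
phases = doubling_phases(n, t)
total = sum(phase_cost(n, s, batch, eps) for _, s, batch in phases)
bound = n ** (1 + eps) + t ** (2 / 3) * n ** (2 / 3 + eps)
```

Comparing `total` with `bound` for a few values of t shows the ratio staying within a small constant, as the geometric-series argument predicts.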

4.2 Avoiding Redundant Queries 

To construct the entire collection of trapezoids 
Vis(A | P), we loop through the pixels, constructing
the trapezoid containing each pixel. Of course, if we 
have already built the trapezoid containing a pixel, we 
want to avoid building it again. There are at least two 
methods for avoiding this redundancy. 

In one method, after we construct each new trapezoid,
we search for and mark all the pixels it contains. This
can be done in $O((n/\sqrt{s}) \log n + k)$ time using a
two-dimensional range searching data structure similar
to the one used in the last stage of our
trapezoid-construction algorithm [3]. Here, s is as usual an
arbitrary parameter between n and $n^2$, and k is the
number of pixels marked. Since the leading term is
dominated by the time to construct the trapezoid in
the first place, this approach adds only an $O(p)$ term to
the overall running time of our hidden-surface removal
algorithm.
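
The pixel-marking loop can be sketched as follows. In this Python illustration, `build_trapezoid` is a hypothetical callback standing in for the four-stage construction of Section 3, and the inner loop over all pixels is a brute-force stand-in for the range-searching structure. The toy decomposition into half-open boxes is an assumption used only to exercise the loop.

```python
def construct_sampled_map(pixels, build_trapezoid, contains):
    """Build one trapezoid per unmarked pixel, then mark the pixels it covers."""
    marked = [False] * len(pixels)
    trapezoids = []
    for i, pixel in enumerate(pixels):
        if marked[i]:
            continue                        # covered by an earlier trapezoid
        tau = build_trapezoid(pixel)        # stands in for Stages 1-4
        trapezoids.append(tau)
        for j, other in enumerate(pixels):  # brute-force range search
            if contains(tau, other):
                marked[j] = True
    return trapezoids


# Toy decomposition: boxes (x0, x1, y0, y1) tiling [0,4) x [0,4).
BOXES = [(0, 2, 0, 2), (2, 4, 0, 2), (0, 4, 2, 4)]


def box_contains(box, pixel):
    x0, x1, y0, y1 = box
    return x0 <= pixel[0] < x1 and y0 <= pixel[1] < y1


calls = []  # records every trapezoid query, to show none is redundant


def toy_oracle(pixel):
    calls.append(pixel)
    return next(box for box in BOXES if box_contains(box, pixel))


pixels = [(0.5, 0.5), (1.5, 1.5), (2.5, 0.5), (1, 3), (3, 3)]
result = construct_sampled_map(pixels, toy_oracle, box_contains)
```

The number of oracle calls equals the number of trapezoids reported, which is the property that makes the overall algorithm output-sensitive.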

Theorem 4.1. Let A be a set of n disjoint triangles in
$\mathbb{R}^3$, and let P be a set of p points in the xy-plane. We
can construct Vis(A | P) in time $O(n^{1+\varepsilon} + t^{2/3} n^{2/3+\varepsilon} +
p)$, where t is the number of trapezoids in Vis(A | P).



Alternately, before querying a new pixel, we could 
first check whether it is contained in an earlier trapezoid 
by performing a point location query. We can maintain
a semi-dynamic set of t interior-disjoint vertical trapezoids
and answer point-location queries in $O(\log t)$ time
per query and $O(\log t)$ amortized time per insertion,
using a data structure of size $O(t \log t)$ based on a
segment tree with fractional cascading [10, 11, 39].
This approach adds $O(p \log t)$ to the overall running
time of our hidden-surface removal algorithm; the total
insertion time $O(t \log t)$ is dominated by other terms.
Although this approach is slower than pixel-marking, it 
can be used when the set of pixels is presented online 
instead of being fixed in advance. 

Theorem 4.2. Let A be a set of n disjoint triangles in
$\mathbb{R}^3$, and let P be a sequence of p points in the xy-plane.
We can maintain Vis(A | P) as points in P are inserted,
in total time $O(n^{1+\varepsilon} + t^{2/3} n^{2/3+\varepsilon} + p \log t)$, where t is
the number of trapezoids in Vis(A | P).



5 A Faster Sweepline Algorithm 
("Traps and Gaps") 

The algorithms described in the previous section work 
for arbitrary sets of pixels. However, in most appli- 
cations of hidden surface removal, the pixels form a 
regular integer grid. In this case, we can improve the 
performance of our algorithm using the following sweep- 
line approach, suggested by Pavan Desikan and Sariel 
Har-Peled [14].

Without loss of generality, we assume that the pixel
lattice is aligned with the coordinate axes. Our improved
algorithm sweeps a vertical line $\ell$ across the
image plane from left to right. At any position, $\ell$
intersects several trapezoids in Vis(A | P). Between any
pair of such trapezoids is a gap, which is a possibly
unbounded, possibly empty triangle bounded on the
left by $\ell$, bounded above by the line through the bottom
edge of the higher trapezoid, and bounded below by the
line through the top edge of the lower trapezoid. Gaps
can intersect each other, as well as other trapezoids that
hit $\ell$. See Figure 3(a).

We store the traps and gaps in two data structures: a
balanced binary search tree and a priority queue. The
binary tree stores the traps and gaps in sorted order
from top to bottom along $\ell$. For the priority queue,
the priority of a trap is the x-coordinate of its right
edge, and the priority of a gap is the x-coordinate of
the leftmost pixel(s) inside the gap, or $\infty$ if the gap
contains no pixels. Since the sweepline clearly crosses
at most t trapezoids, the cost of inserting or deleting a
trap or gap from the sweep structures is $O(\log t)$. Note
that this is bounded by both $O(\log p)$ and $O(\log n)$.

Figure 3. (a) Just before and (b) just after the sweepline crosses
the right edge of a trapezoid and its neighboring gaps are merged.
Leftmost pixels in each gap, if any, are circled.

To find the leftmost pixel inside a gap, we use the 
following two-dimensional integer programming result 
of Kanamaru et al. [31]; see also [32, 19]. For related 
results on enumerating integer points in convex poly- 
gons, see [34, 35, 26, 27]. 

Lemma 5.1 (Kanamaru et al. [31]). Given a convex
m-gon $\Pi$, we can find the lowest leftmost integer point
in $\Pi$, or determine that $\Pi$ contains no integer points, in
time $O(m + \log \delta)$, where $\delta$ is the length of the shortest
edge of the axis-aligned bounding box of $\Pi$.

Corollary 5.2. We can find a leftmost pixel in any gap,
or determine that there is no such pixel, in $O(\log p)$
time.
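
For intuition about what this query returns, the following Python sketch finds the lowest leftmost pixel in a gap by scanning grid columns. It is a brute-force stand-in that takes time linear in the grid width, not the integer programming method of Kanamaru et al. The representation of the gap by the sweepline position and two (slope, intercept) lines, the integer pixel lattice, and the parameter `x_max` for the last grid column are assumptions.

```python
import math


def leftmost_pixel_in_gap(x_sweep, upper, lower, x_max):
    """Lowest leftmost integer point (x, y) with x_sweep <= x <= x_max that lies
    on or below the line `upper` and on or above the line `lower`.

    upper, lower: (slope, intercept) pairs. Returns None if the gap is empty.
    """
    for x in range(math.ceil(x_sweep), x_max + 1):
        y_low = math.ceil(lower[0] * x + lower[1])     # lowest candidate row
        y_high = math.floor(upper[0] * x + upper[1])   # highest candidate row
        if y_low <= y_high:
            return (x, y_low)
    return None


# A wedge that opens to the right, and a thin strip containing no lattice row.
wedge = leftmost_pixel_in_gap(0.5, (0.25, 0.5), (-0.25, 0.4), 10)
strip = leftmost_pixel_in_gap(0.5, (0, 0.9), (0, 0.1), 10)
```

In the sweepline algorithm, the x-coordinate of the returned pixel serves as the priority of the gap, with an empty gap receiving priority infinity.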



We do not require that the sweepline structures
always contain every trapezoid in Vis(A | P) that intersects
$\ell$. Instead we maintain the following weaker
invariant: whenever $\ell$ reaches a pixel $\pi$, the trapezoid
$\tau_\pi \in$ Vis(A | P) containing $\pi$ must be stored in the
sweepline structures. We initialize the sweep structure
with a single gap that contains the entire pixel grid.

When the sweepline $\ell$ reaches the right edge of a trap
$\tau$, we delete it from the sweep structure. We also delete
the gaps immediately above and below $\tau$ and insert
the new larger gap. Manipulating the sweep structure
requires $O(\log t)$ time, and finding a leftmost pixel in
the new gap requires $O(\log p)$ time, so the total time
required to kill a single trap is $O(\log p)$.

When $\ell$ reaches a leftmost pixel $\pi$ in a gap $\gamma$,
we perform a trapezoid query to find the trap $\tau_\pi \in$
Vis(A | P) containing $\pi$. We then delete $\gamma$ from the
sweep structure, insert $\tau_\pi$, and insert the two smaller
gaps $\gamma^+$ and $\gamma^-$ immediately above and below $\tau_\pi$. The
new trap $\tau_\pi$ may not contain all the leftmost pixels in $\gamma$;
any omitted pixels will now be a leftmost pixel in either
$\gamma^+$ or $\gamma^-$. If some new gap contains a leftmost pixel
of $\gamma$, it will be (recursively) filled before the sweepline
moves again. (We can avoid creating such "transient"
gaps by storing the highest and lowest leftmost pixels
in each gap $\gamma$, at an additional cost of $O(1)$ time when
$\gamma$ is created, but this improves the running time of our
algorithm by at most a constant factor.) For each new
trap inserted, our algorithm spends $O(\log p)$ time and
creates at most two new gaps.

Every gap except the initial one is created when a trap
is inserted or deleted. We can charge at most three gaps
to each trap: the gaps immediately above and below
when the trap is inserted, and the gap left behind when
the trap is deleted. The total number of gaps created
over the entire algorithm is therefore at most 3t + 1. It
follows that the total time spent finding leftmost pixels
is $O(t \log p)$, and the total time spent manipulating the
sweep structures is $O(t \log t)$. All the remaining time is
spent on trapezoid queries, as in our earlier algorithms.

Theorem 5.3. Let A be a set of n disjoint triangles
in $\mathbb{R}^3$, and let P be a regular lattice of p points in
the xy-plane. We can construct Vis(A | P) in time
$O(n^{1+\varepsilon} + t^{2/3} n^{2/3+\varepsilon} + t \log p)$, where t is the number
of trapezoids in Vis(A | P).

Note that this time bound is sublinear in p unless $t =
\Omega(p / \log p)$. Moreover, the $O(t \log p)$ term is dominated
by other terms unless either t is nearly quadratic in n
or $p = 2^{\Omega(n^c)}$ for some positive constant c.


6 Discussion and Open Problems 

One interesting special case of hidden-surface removal 
is the so-called window rendering problem, where the 
objects are axis-aligned horizontal rectangles. A simple 
modification of our algorithm solves this problem in 
time $O(n \log n + t \log n + p)$, which compares favorably
with the best analytic solutions [8, 22]. If the pixels
form a regular grid, we can improve the running time
to $O(n \log n + t \log n)$ using the sweepline approach.
(Note that this time bound does not depend at all on the 
number of pixels!) Similar improvements can be made 
for c-oriented polyhedra [7]. It seems likely that our
techniques can also be extended to other special cases
of hidden surface removal with faster analytic solutions,
such as polyhedral terrains [43] and objects whose union
has small complexity [33, 25]. 

Perhaps the most interesting open question is 
whether sampled visibility maps, or some other similar 
image decomposition, can be constructed efficiently in 
practice. As we mentioned in the introduction, ray- 
shooting queries are already answered in practice by 
walking through a spatial decomposition defined by the 
input objects. The same spatial decomposition can 
also be used to answer ray-dragging queries [40] and 
trapezoidal beam queries. Since curved models are 
often polygonalized (and complex polyhedral models 
are often simplified) so that each polygonal facet covers 
only a few pixels, a practical implementation may 
require the sampled visibility map to be redefined in 
terms of higher-level objects, such as convex polyhedra 
or algebraic surface patches, instead of triangles. 

A practical implementation of our ideas would have 
other interesting applications. By changing the order 
in which our algorithm processes pixels, we can make it 
suitable for progressive rendering, where the quality of 
the image improves smoothly over time as finer and finer 
details are computed, or foveated rendering, where fine 
details are more important in certain areas of the image 
than others. Another possible application is occlusion 
culling [12, 13, 30, 36, 47]. By sampling the visibility 
map at a small number of random points, we can quickly 
establish a set of simple occluders that can be used 
for conservative visibility tests. The occlusion tests 
themselves would be slightly simpler than in earlier 
approaches: A triangle is invisible if its projection is 
contained in some trapezoid. 

Sampled visibility maps exploit spatial coherence well 
in a global sense; the number of regions is never much 
larger than the size of the visibility map. In a more local 
sense, however, there is clearly room for improvement. 
Consider an image that contains mostly empty space, 
except for a large number of small triangles near the 
boundary. The sampled visibility map consists of 
several tall thin trapezoids, but a better decomposition 
would have a single region covering most of the image. 
It would be interesting to develop decompositions with 
better local behavior — perhaps where the expected 
size of the component containing a random pixel is 
maximized, or where the size of a component is tied 
to the local feature size [44] of the visibility map near 
that component — but with the same global properties 
as sampled visibility maps. 